Coarse Word-Sense Disambiguation Using Common Sense

Authors

  • Catherine Havasi
  • Robert Speer
  • James Pustejovsky
Abstract

Coarse word sense disambiguation (WSD) is an NLP task that is both important and practical: it aims to distinguish senses of a word that have very different meanings, while avoiding the complexity that comes from trying to finely distinguish every possible word sense. Reasoning techniques that make use of common sense information can help to solve the WSD problem by taking word meaning and context into account. We have created a system for coarse word sense disambiguation using blending, a common sense reasoning technique, to combine information from SemCor, WordNet, ConceptNet and Extended WordNet. Within that space, a correct sense is suggested based on the similarity of the ambiguous word to each of its possible word senses. The general blending-based system performed well at the task, achieving an f-score of 80.8% on the 2007 SemEval Coarse Word Sense Disambiguation task.

Common Sense for Word Sense Disambiguation

When artificial intelligence applications deal with natural language, they must frequently confront the fact that words with the same spelling can have very different meanings. The task of word sense disambiguation (WSD) is therefore critical to the accuracy and reliability of natural language processing. The problem of understanding ambiguous words would be greatly helped by understanding the relationships between the meanings of these words and the meaning of the context in which they are used – information that is largely contained in the domain of common sense knowledge.

Consider, for example, the word bank and two of its prominent meanings. In one meaning, a bank is a business institution where one would deposit money, cash checks, or take out loans: “The bank gave out fewer loans since the recession.” In the second, the word refers to the edges of land around a river, as in “I sat by the bank with my grandfather, fishing.” We can use common sense to understand that there would not necessarily be loans near a river, and that fishing would rarely take place in a financial institution. We know that a money bank is different from a river bank because they have different common-sense features, and those features affect the words that are likely to appear with the word bank.

In developing the word sense disambiguation process that we present here, our aim is to use an existing technique, called “blending” (Havasi et al. 2009), that was designed to integrate common sense into other applications and knowledge bases. Blending creates a single vector space that models semantic similarity and associations from several different resources, including common sense. We use generalized notions of similarity and association within that space to produce disambiguations; a toy sketch of this similarity-based selection appears after this excerpt. Using this process, instead of introducing a new and specialized process for WSD, will help to integrate disambiguation into other systems that currently use common sense.

Coarse-Grained Word Sense Disambiguation

A common way to evaluate word sense disambiguation systems is to compare them to gold standards created by human annotators. However, many such corpora suffer from low inter-annotator agreement: they are full of distinctions which are difficult for humans to judge, at least from the documentation (i.e., glosses) provided. As a solution to this, the coarse word sense disambiguation (Coarse WSD) task was introduced by the SemEval evaluation exercise. In the coarse task, the number of word senses has been reduced.
In Figure 1 we can see this simplification. Coarse word senses allow for higher inter-annotator agreement. In the fine-grained Senseval-3 WSD task, there was an inter-annotator agreement of 72.5% (Snyder and Palmer 2004); this annotation used expert lexicographers. The Open Mind Word Expert task used untrained internet volunteers for a similar task (Chklovski and Mihalcea 2002) and received an inter-annotator agreement score of 67.3%. (The Open Mind project is a family of projects started by David Stork, of which Open Mind Common Sense is a part; thus Open Mind Word Expert is not a part of OMCS.) These varying and low inter-annotator agreement scores call into question the relevance of fine-grained distinctions.

The Coarse-Grained Task

SemEval 2007 Task 7 was the “Coarse-Grained English All-Words Task” (Navigli and Litkowski 2007), which examines the traditional WSD task in a coarse-grained way, run by ...
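
As a rough illustration of the similarity-based selection described above, the following sketch (Python with NumPy) builds a toy blended space: two small, fabricated resource matrices are normalized, summed, and factored with a truncated SVD, and the candidate sense whose vector is closest to the averaged context vector is chosen. The concept labels, feature counts, weighting, and helper functions are illustrative assumptions, not the authors' data or code; the actual system blends much larger, sparse matrices derived from SemCor, WordNet, ConceptNet, and Extended WordNet.

    # Toy sketch of blending-style coarse WSD (not the authors' code).
    # Idea: combine several knowledge sources into one semantic space via a
    # truncated SVD, then pick the candidate sense whose vector is most
    # similar to the vector of the ambiguous word's context.
    import numpy as np

    # Hypothetical concept-by-feature counts from two fabricated "resources".
    CONCEPTS = ["bank#finance", "bank#river", "loan", "fishing", "money", "water"]
    FEATURES = ["deposit", "interest", "shore", "rod", "cash", "flow"]  # column order

    resource_a = np.array([   # e.g., common-sense-style assertions
        [3, 2, 0, 0, 4, 0],   # bank#finance
        [0, 0, 4, 1, 0, 3],   # bank#river
        [2, 3, 0, 0, 3, 0],   # loan
        [0, 0, 2, 4, 0, 1],   # fishing
        [3, 2, 0, 0, 5, 0],   # money
        [0, 0, 3, 1, 0, 4],   # water
    ], dtype=float)

    resource_b = np.array([   # e.g., corpus co-occurrence counts
        [2, 3, 0, 0, 2, 0],
        [0, 0, 3, 2, 0, 2],
        [3, 2, 0, 0, 2, 0],
        [0, 0, 1, 3, 0, 2],
        [2, 1, 0, 0, 4, 0],
        [0, 0, 2, 1, 0, 3],
    ], dtype=float)

    def blend(matrices, weights, k=3):
        # Weighted sum of Frobenius-normalized matrices, then truncated SVD.
        # (A simplification of the blending technique of Havasi et al. 2009.)
        combined = sum(w * (m / np.linalg.norm(m)) for m, w in zip(matrices, weights))
        u, s, _ = np.linalg.svd(combined, full_matrices=False)
        return u[:, :k] * s[:k]          # one k-dimensional vector per concept

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    def disambiguate(context_words, candidate_senses, space):
        # Choose the candidate sense most similar to the averaged context vector.
        index = {concept: i for i, concept in enumerate(CONCEPTS)}
        context_vec = np.mean([space[index[w]] for w in context_words], axis=0)
        return max(candidate_senses, key=lambda s: cosine(space[index[s]], context_vec))

    space = blend([resource_a, resource_b], weights=[1.0, 1.0])
    print(disambiguate(["fishing", "water"], ["bank#finance", "bank#river"], space))
    # Prints "bank#river" for this toy data.

In the real system the blended matrix is far larger and sparser, and the candidates are the WordNet senses of the ambiguous word, but the selection rule sketched here, take the sense with the highest similarity to the context in the shared space, matches the description above in spirit.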

Similar Articles

A Fully Unsupervised Word Sense Disambiguation Method Using Dependency Knowledge

Word sense disambiguation is the process of determining which sense of a word is used in a given context. Due to its importance in understanding semantics of natural languages, word sense disambiguation has been extensively studied in Computational Linguistics. However, existing methods either are brittle and narrowly focus on specific topics or words, or provide only mediocre performance in re...

Merging Word Senses

WordNet, a widely used sense inventory for Word Sense Disambiguation (WSD), is often too fine-grained for many Natural Language applications because of its narrow sense distinctions. We present a semi-supervised approach to learn similarity between WordNet synsets using a graph-based recursive similarity definition. We seed our framework with sense similarities of all the word-sense pairs, learn...

GPLSI: Word Coarse-grained Disambiguation aided by Basic Level Concepts

We present a corpus-based supervised learning system for coarse-grained sense disambiguation. In addition to usual features for training in word sense disambiguation, our system also uses Base Level Concepts automatically obtained from WordNet. Base Level Concepts are synsets that generalize a hyponymy sub-hierarchy, and provide an extra level of abstraction as well as relevant informatio...

Combining Contextual Features for Word Sense Disambiguation

In this paper we present a maximum entropy Word Sense Disambiguation system we developed which performs competitively on SENSEVAL-2 test data for English verbs. We demonstrate that using richer linguistic contextual features significantly improves tagging accuracy, and compare the system’s performance with human annotator performance in light of both fine-grained and coarse-grained sense distin...

Combining ConceptNet and WordNet for Word Sense Disambiguation

Knowledge-based Word Sense Disambiguation (WSD) methods heavily depend on knowledge. Therefore, enriching knowledge is one of the most important issues in WSD. This paper proposes a novel idea of combining WordNet and ConceptNet for WSD. First, we present a novel method to automatically disambiguate the concepts in ConceptNet, and then we enrich WordNet with large amounts of semantic relations f...


Published in: Commonsense Knowledge: Papers from the AAAI Fall Symposium (FS-10-02)

Publication year: 2010